The close genetic and antigenic relatedness among the group 2 coronaviruses human coronavirus OC43
(HCoV-OC43), bovine coronavirus (BCoV), and porcine hemagglutinating encephalomyelitis virus (PHEV)
suggests that these three viruses with different host specificities diverged fairly recently. In this study, we
determined the complete genomic sequence of PHEV (strain PHEV-VW572), revealing the presence of a
truncated group 2-specific ns2 gene in PHEV in comparison to other group 2 coronaviruses. Using a relaxed
molecular clock approach, we reconstructed the evolutionary relationships between PHEV, BCoV, and
HCoV-OC43 in real-time units, which indicated relatively recent common ancestors for these
species-specific coronaviruses.


Coronaviruses (family Coronaviridae, order Nidovirales) are 
large, enveloped, positive-stranded RNA viruses with a typical 
crown-like appearance. Their viral genomes (27 to 32 kb) are 
some of the largest known among all RNA viruses (12). Based 
on genetic and serological relationships, coronaviruses can be 
classified into three groups (8). Group 2 coronaviruses include 
murine hepatitis virus (MHV), bovine coronavirus (BCoV), 
human coronavirus OC43 (HCoV-OC43), rat sialodacryoadenitis
virus, porcine hemagglutinating encephalomyelitis virus
(PHEV), canine respiratory coronavirus, and equine
coronavirus.

PHEV was first isolated in 1962 in Canada from suckling
piglets with encephalomyelitis (9, 18) and is now known to be
widespread among swine worldwide, with frequent subclinical
infections. The virus has a strong tropism for
epithelial cells of the upper respiratory tract and for the central
nervous system (CNS) and is transmitted through nasal
secretions (1). In addition to encephalomyelitis,
vomiting and wasting disease can be another manifestation of
PHEV infection in piglets (22). The clinical signs of vomiting
and wasting are assumed to be centrally induced by infection
of the vagus nerve, but further dissemination
of the virus into the CNS may lead to centrally induced motor
disorders.

In this study, we determined the full-length genome sequence
of the PHEV-VW572 strain and reconstructed the
common evolutionary history of PHEV and the closely related
BCoV and HCoV-OC43. The PHEV-VW572 strain was isolated
in Belgium in 1972 from the tonsils of two diseased pigs
obtained from a litter in which an outbreak of vomiting and
wasting disease occurred without further progression towards
CNS motor disorders (23). The isolate was propagated in a
primary porcine kidney cell line. To determine the full-length
genome sequence, we used primers developed previously for
sequencing group 2 coronaviruses (33).

Multiple sequence alignments were prepared using ClustalX
version 1.82 (30) and manually edited in GeneDoc (21).
Maximum-likelihood phylogenetic analyses were conducted in
Tree-Puzzle 5.1 using the VT (Mueller-Vingron 2000) model
of amino acid substitution and a gamma distribution to model
among-site rate heterogeneity (29). The SimPlot program
(version 3.2) was used to analyze the genetic distance of the
complete genomes of PHEV-VW572, two BCoV strains (BCoV-LUN
and BCoV-Mebus), and a contemporary HCoV-OC43 strain
(HCoV-OC43 BE03) in reference to the complete genome of the
HCoV-OC43 ATCC strain, and this genetic distance was plotted
versus nucleotide (nt) position (14). Divergence times were
estimated using a Bayesian coalescent approach implemented
in BEAST version 1.2 (6). A novel relaxed molecular clock model
that allows rates to change among branches in an uncorrelated
fashion was applied (5). In this approach, the rate on each
branch is sampled independently from the same underlying
distribution, in this case an exponential distribution. Markov
chain Monte Carlo chains were run for 35 × 10^6 generations
using a Hasegawa-Kishino-Yano model of nucleotide
substitution with gamma-distributed rates among sites and a
constant population size as a tree prior. Mean estimates and
credibility intervals for the continuous parameters were
obtained using Tracer (Rambaut and Drummond, 2003, available
from http://evolve.zoo.ox.ac.uk/); the burn-in was set at 10%
of the sampled states. Instantaneous nonsynonymous
substitution (dN) and synonymous substitution (dS) rates were
estimated using a maximum-likelihood sliding window approach
as previously described (13). A window size of 600 nt and a step
size of 60 nt were used in the analysis.
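
The idea behind the uncorrelated relaxed clock can be illustrated with a
minimal Python sketch. It draws an independent rate for each branch from
one exponential distribution and converts branch durations into expected
substitutions per site. The branch names, durations, and mean rate are
hypothetical values chosen only for illustration, and the code is not the
BEAST implementation, which estimates these quantities jointly by MCMC.

```python
import random


def sample_branch_rates(branch_names, mean_rate, rng):
    """Draw one rate per branch, independently, from Exponential(mean_rate)."""
    # random.expovariate takes lambda = 1 / mean.
    return {name: rng.expovariate(1.0 / mean_rate) for name in branch_names}


def expected_branch_lengths(durations_years, rates):
    """Branch length (substitutions per site) = rate x duration."""
    return {name: rates[name] * durations_years[name] for name in durations_years}


# Hypothetical branch durations in years (not taken from the study).
durations = {"branch_A": 30.0, "branch_B": 55.0, "branch_C": 120.0}

rng = random.Random(1)  # fixed seed so the sketch is repeatable
rates = sample_branch_rates(durations, mean_rate=5e-4, rng=rng)
lengths = expected_branch_lengths(durations, rates)

for name in durations:
    print(f"{name}: rate = {rates[name]:.2e}, length = {lengths[name]:.4f}")
```

Because each rate is drawn without reference to the parent branch, neighboring
branches may differ widely in rate, which is what "uncorrelated" refers to here.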

The complete PHEV genome comprises 30,480 nucleotides,
excluding the 3' terminal polyadenylation tail, and has a GC
content of 37.2%. The nucleotide sequence data were deposited
in GenBank under accession number DQ011855. An analysis
of these data revealed a significant truncation of the group
2-specific ns2 gene in PHEV in comparison with the ns2 gene
in BCoV, HCoV-OC43, and MHV. The PHEV ns2 gene is
only 585 nt in length, coding for a 194-amino-acid
nonstructural protein. The carboxy-terminal truncation of the
PHEV ns2 protein is caused by a deletion, at the 3' end of the
gene, of 211 nucleotides that are present in the BCoV and
HCoV-OC43 genes encoding this protein. In the amino-terminal
part of the ns2 protein of group 2 coronaviruses, a cyclic
phosphodiesterase activity has been predicted (16, 27). These
viral cyclic phosphodiesterase domains, which have also been
predicted in toroviruses and rotaviruses, are, like their
cellular counterparts, believed to mediate the conversion of
ADP ribose 1"-2" cyclic phosphate to ADP ribose 1"-phosphate,
which is part of the processing of tRNA-splicing products (36).
Although ns2 has been shown to be nonessential for in vitro
coronavirus replication, a role for ns2 in viral pathogenicity
is suggested by the observation that the deletion of MHV ns2
significantly attenuates the virus when it is inoculated into
mice (4, 26). Potential nucleotide binding
domains have been identified in the amino-terminal part of the
ns2 protein of MHV-A59 and BCoV, and similar domains can
also be found in the PHEV ns2 protein (3, 15).
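
The gene and protein lengths quoted above can be checked with simple codon
arithmetic. The short Python sketch below assumes that the 585-nt figure
includes the stop codon, which is consistent with a 194-amino-acid product;
the function name is illustrative only.

```python
def protein_length(orf_nt, includes_stop=True):
    """Number of amino acids encoded by an ORF of `orf_nt` nucleotides."""
    if orf_nt % 3 != 0:
        raise ValueError("ORF length is not a whole number of codons")
    codons = orf_nt // 3
    # The stop codon occupies one codon but encodes no amino acid.
    return codons - 1 if includes_stop else codons


print(protein_length(585))  # 585 / 3 = 195 codons, minus the stop codon
print(211 % 3)              # remainder when the deletion is split into codons
```

The second line shows that the 211-nt deletion is not a whole number of
codons, since 211 leaves a remainder of 1 when divided by 3.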

A difference in length between the PHEV-VW572 ns4.9
open reading frame (ORF) and those of two other PHEV
strains (67N and IAF404; GenBank accession no. AY078417
and AF481863) was found. The ns4.9 and ns4.8 ORFs are two
ORFs present in the BCoV genome, located between the spike
gene and the ns12.9 ORF. In PHEV, a nucleotide deletion
similar to the 290-nt deletion in HCoV-OC43 can be
demonstrated (20, 33), leading to the absence of ns4.8 and a
truncated ns4.9, encoding a 20-amino-acid protein in PHEV
strains 67N and IAF404. In PHEV-VW572, this ns4.9 ORF codes
for a protein of 24 amino acids, as has also been demonstrated
for PHEV strain NT9 (31). Whether this difference has a
functional consequence is not yet known. Interestingly,
a truncation of the BCoV ns4.9 protein represents a
significant difference between bovine respiratory and enteric
coronavirus isolates, suggesting a possible role of ns4.9 in
tissue tropism (7, 24). Of the PHEV strains compared
in this study, the PHEV-IAF404 isolate may be more invasive
for the CNS, as it was reported to cause encephalomyelitis
associated with paralysis in addition to the manifestations of
vomiting and wasting disease (25). It is therefore possible
that the ns4.9 protein plays a role in the ability or inability
of a PHEV strain to disseminate into the CNS. In PHEV-67N,
however, an ns4.9 protein of the same length as that in
PHEV-IAF404 was found, and this strain was isolated from
subclinically infected older pigs. In experimental infections,
PHEV-67N was shown to be pathogenic for the CNS of
neonatal pigs (19). This could suggest that the ability of PHEV
strains to cause motor disease is age dependent and
that motor disease occurs only in very young pigs.
Whether there are true strain differences in the invasive
capacity of PHEV strains for the CNS and whether the ns4.9
protein plays a role in this remain to be investigated.

FIG. 1. Maximum-likelihood tree of coronavirus ORF1b amino
acid sequences. PHEV ORF1b (GenBank accession no. DQ011855)
was compared to other coronaviruses on the amino acid level. Group
1 includes human coronavirus 229E (HCoV-229E, AF304460), human
coronavirus NL63 (HCoV-NL63, AY567487), porcine epidemic
diarrhea virus strain CV777 (PEDV, AF353511), and porcine
transmissible gastroenteritis virus Purdue strain (TGEV, AJ271965).
Group 2 includes human coronavirus OC43 (HCoV-OC43, AY391777),
bovine coronavirus Mebus strain (BCoV, U00735), murine hepatitis
virus Penn 97-1 strain (MHV-Penn97-1, AF208066), and murine
hepatitis virus A59 (MHV-A59, NC_001846). Group 3 includes avian
infectious bronchitis virus Beaudette strain (IBV-Beaudette, M95169),
avian infectious bronchitis virus BJ strain (IBV-BJ, AY319651), and
avian infectious bronchitis virus LX4 strain (IBV-LX4, AY338732).
Human coronavirus HKU1 (HCoV-HKU1, NC_006577) and SARS
coronavirus Frankfurt-1 strain (SARS-CoV, AY291315) are shown. The
outgroup includes equine Berne torovirus (EToV, X52374). The scale
bar represents the genetic distance (nucleotide substitutions per site).

In a maximum-likelihood phylogenetic tree, the close genetic
relatedness between PHEV, BCoV, and HCoV-OC43 is
evident from the well-supported monophyletic cluster of the
three viruses (Fig. 1). Using a SimPlot analysis, we
demonstrated that in more than two-thirds of the genome, the
genetic distance between PHEV and HCoV-OC43 is similar to the
distance between BCoV and HCoV-OC43 (Fig. 2). However,
in the genome region containing the spike gene, the genetic
distance of PHEV to HCoV-OC43 is significantly higher than
the distance of BCoV to HCoV-OC43.
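
A sliding-window distance scan of this kind can be sketched in a few lines
of Python. The sketch assumes two sequences that are already aligned to the
same length and uses the uncorrected percent difference (p-distance) for
simplicity; SimPlot itself offers other distance models, so this is an
illustration of the principle, not a reimplementation. The default window
and step (400 nt and 200 nt) follow the legend of Fig. 2, and the toy
sequences are invented.

```python
def p_distance(a, b):
    """Percent of differing sites, skipping gaps and ambiguous bases."""
    valid = "ACGT"
    compared = diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in valid and y in valid:
            compared += 1
            if x != y:
                diffs += 1
    return 100.0 * diffs / compared if compared else float("nan")


def sliding_window_distance(query, reference, window=400, step=200):
    """Return (window center, percent distance) pairs along the alignment."""
    if len(query) != len(reference):
        raise ValueError("sequences must be aligned to equal length")
    points = []
    for start in range(0, len(query) - window + 1, step):
        end = start + window
        center = start + window // 2
        points.append((center, p_distance(query[start:end], reference[start:end])))
    return points


# Toy alignment: the second half of the query differs at every fourth site.
reference = "ACGT" * 10
query = "ACGT" * 5 + "ACGA" * 5

for center, dist in sliding_window_distance(query, reference, window=20, step=10):
    print(center, dist)
```

Plotting the returned pairs for each query genome against the same reference
gives curves of the kind shown in a SimPlot distance plot, in which a locally
divergent region such as the spike gene appears as a peak.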

Based on the spike gene sequence data and on nucleocapsid
gene sequence data, we performed a relaxed molecular clock
analysis of PHEV, BCoV, and HCoV-OC43 strains for which
the date of sampling could be obtained (Table 1). In this
analysis, we did not use sequence data from the ORF1ab
region, as these data are available for only a limited number of
PHEV, BCoV, and HCoV-OC43 strains with known sampling
dates. The mean evolutionary rate estimate of the spike gene
in PHEV, BCoV, and HCoV-OC43 is 6.1 × 10^-4 nucleotide
substitutions per site per year, with a 95% highest posterior
density (HPD) interval of 2.1 × 10^-4 to 1.0 × 10^-3. For the
nucleocapsid gene in PHEV, BCoV, and HCoV-OC43, the
mean evolutionary rate is estimated to be 3.6 × 10^-4
nucleotide substitutions per site per year, with a 95% HPD
interval of 1.1 × 10^-4 to 6.3 × 10^-4. The ancestral PHEV
strain diverged from the common ancestor of BCoV and
HCoV-OC43, and this event could be dated back to around 1878
(95% HPD interval, 1747 to 1954) based on nucleocapsid gene
sequence data. When spike gene sequence data were used in this
analysis, this event was dated approximately 100 years
earlier (1777; 95% HPD interval, 1558 to 1919). This reflects
the higher genetic distance for PHEV relative to HCoV-OC43
in this gene (Fig. 2), which implies an elevated
evolutionary rate for the porcine coronavirus lineage in the
spike genomic region. A maximum-likelihood sliding window
approach was used to estimate dN and dS across the
genome (data not shown). In the region containing the spike
gene, dS is significantly higher than dN, indicating that
mostly synonymous substitutions are responsible for the
higher spike evolutionary rate in the PHEV lineage.
Positive selection is therefore less likely, unless
the synonymous substitutions were selected for
their role in the RNA secondary structure of this genomic
region. Another explanation might be a recombination
event between an ancestral strain of PHEV and another
hitherto unknown coronavirus. However, this hypothesis
would not explain why an excess of synonymous substitutions
is responsible for the high genetic distance in the
PHEV spike gene region, and thus we cannot provide
conclusive evidence for these speculations.

FIG. 2. Linear representation of the PHEV-VW572 genome (GenBank accession no. DQ011855) and SimPlot analysis of complete genome
sequence data of PHEV-VW572, BCoV-LUN (AF391542), BCoV-Mebus (U00735), HCoV-OC43 BE03 (AY903459), and HCoV-OC43 ATCC
(American Type Culture Collection, AY391777). Each point plotted is the percent genetic distance within a sliding window 400 nt wide, centered
on the position plotted, with a step size of 200 nt. Each curve represents a comparison of the sequence data of PHEV-VW572, the BCoV strains,
and HCoV-OC43 BE03 to the reference sequence data of the ATCC HCoV-OC43 strain. HE, hemagglutinin-esterase gene; S, spike gene; E,
envelope protein gene; M, membrane protein gene; N, nucleocapsid protein gene; I, internal ORF.
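
For readers unfamiliar with HPD intervals, the following Python sketch shows
one common way to obtain a mean and a 95% HPD interval from a posterior
sample after discarding 10% burn-in: the HPD interval is taken as the
shortest interval that contains 95% of the retained samples. The simulated
chain is hypothetical (lognormal draws centered near 6 × 10^-4), and the
code is an illustration of the concept, not the procedure used by Tracer.

```python
import math
import random


def discard_burn_in(samples, fraction=0.10):
    """Drop the first `fraction` of the sampled states as burn-in."""
    return samples[int(len(samples) * fraction):]


def hpd_interval(samples, mass=0.95):
    """Return the shortest interval that contains `mass` of the samples."""
    xs = sorted(samples)
    n = len(xs)
    if n == 0:
        raise ValueError("no samples")
    # Number of samples the interval must contain; the small epsilon
    # guards against floating-point noise in mass * n.
    k = min(n, max(1, math.ceil(mass * n - 1e-9)))
    start = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[start], xs[start + k - 1]


# Hypothetical posterior sample of an evolutionary rate (not study data).
rng = random.Random(0)
chain = [rng.lognormvariate(math.log(6e-4), 0.4) for _ in range(1000)]

kept = discard_burn_in(chain)
low, high = hpd_interval(kept)
mean_rate = sum(kept) / len(kept)
print(f"mean = {mean_rate:.2e}, 95% HPD = {low:.2e} to {high:.2e}")
```

For a skewed posterior, the HPD interval is generally not symmetric around
the mean, which is consistent with the asymmetric date intervals reported
for the divergence times.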

Whether the most recent common ancestor (MRCA) of
PHEV, BCoV, and HCoV-OC43 was a virus replicating in a
porcine, bovine, or human host, in all three species, or even in
another species cannot be inferred from the present data, but
we can speculate that interspecies transmission events
occurred prior to the emergence of PHEV, BCoV, and
HCoV-OC43. The divergence of BCoV and HCoV-OC43 strains
could be dated back to the end of the 19th or the beginning of
the 20th century, in agreement with our previous study
(33). The time to the most recent common ancestor (TMRCA)
estimates were relatively consistent when spike gene (1902;
95% HPD interval, 1802 to 1956) or nucleocapsid gene (1910;
95% HPD interval, 1812 to 1961) sequence data were used.
Interestingly, the MRCAs of each of the species-specific
strains, i.e., of the PHEV strains, the BCoV strains, and the
HCoV-OC43 strains individually, were all estimated to have
existed in the recent past, i.e., only 50 to 60 years ago. These
TMRCA estimates were relatively consistent whether the analysis
was based on spike gene data (for PHEV strains, 1942; 95%
HPD interval, 1894 to 1968; for BCoV strains, 1944; 95% HPD
interval, 1910 to 1963; and for HCoV-OC43 strains, 1944; 95%
HPD interval, 1910 to 1963) or on nucleocapsid gene sequence
data (for PHEV strains, 1945; 95% HPD
interval, 1894 to 1968; for BCoV strains, 1951; 95% HPD
interval, 1921 to 1965; and for HCoV-OC43 strains, 1957; 95%
HPD interval, 1936 to 1961). The isolation areas of the PHEV,
BCoV, and HCoV-OC43 strains used in this analysis are
distributed across the North American and European continents,
indicating that these coronaviruses might have spread, in their
natural hosts, over a large geographical region in a relatively
short period of time. Our analysis does not imply that the
origin of this coronavirus lineage cannot be earlier than the
MRCA, which relates only to the coronaviruses that are
currently circulating. Continual extinction events might have
replaced earlier lineages in these species (10).

TABLE 1. Date and area of sampling of porcine, bovine, and
human coronaviruses used to calculate the TMRCA

Strain                              Sampling date  Sampling area              Reference
PHEV-VW572                          1972           Belgium                    23
PHEV-67N                            1970           Iowa                       18
PHEV-IAF404                         1998           Quebec, Canada             24
BCoV-LY138                          1965           Utah                       35
BCoV Mebus                          1972           Quebec, Canada             7
BCoV Quebec                         1972           Quebec, Canada             11
BCoV-F15-BECS                       1979           France                     35
BCoV M80844 (a)                     1989           Giessen, Germany           34
BCoV-OK05143                        1996           Kansas                     7
BCoV-LSU94                          1994           Louisiana                  2
BCoV-ENT                            1998           Texas                      28
BCoV-LUN                            1998           Texas                      28
HCoV-OC43 ATCC                      1967           Salisbury, United Kingdom  17
HCoV-OC43 BE03 isolate 37767 (a)    2003           Belgium                    32
HCoV-OC43 BE03 isolate 84020 (a)    2003           Belgium                    32
HCoV-OC43 BE03 isolate 87309        2003           Belgium                    32
HCoV-OC43 BE03 isolate 89996 (a)    2003           Belgium                    32
HCoV-OC43 BE04 isolate 19572        2004           Belgium                    32
HCoV-OC43 BE04 isolate 34364 (a)    2004           Belgium                    32
HCoV-OC43 BE04 isolate 36638 (a)    2004           Belgium                    32

(a) Strain for which the nucleocapsid gene sequence data are unavailable.

This study provides insights into the evolutionary relationships
among the closely related group 2 coronaviruses PHEV,
BCoV, and HCoV-OC43. The reconstruction of the evolutionary
histories of closely related viruses with different host
specificities might be useful for elucidating the processes of viral
emergence as a result of interspecies transmission events, such
as the emergence of the severe acute respiratory syndrome
(SARS) coronavirus.